Should you use your data warehouse as your CDP?
There's a case for and against using your data warehouse as a customer data platform. Here are three ways to make it work.
The advent of cloud-based data warehouses (DWHs) has brought simpler deployment, greater scale and better performance to a growing set of data-driven use cases. DWHs have become more prevalent in enterprise tech stacks, including martech stacks.
Inevitably, this raises the question: should you employ your existing DWH as a customer data platform (CDP)? After all, when you reuse an existing component in your stack, you can save resources and avoid new risks.
But the story isn’t so simple, and multiple potential design patterns await. Ultimately, there’s a case for and against using your DWH as a CDP. Let’s dig deeper.
DWH as a CDP may not be right for you
There are several inherent problems with using a DWH as a CDP. The first is obvious: not all organizations have a DWH in place. Sometimes, an enterprise DWH team does not have the time or resources to support customer-centered use cases. Other enterprises effectively deploy a CDP as a quasi-data warehouse. (Not all CDPs can do this, but you get the point.)
Let’s say you have most or all of your customer data in a DWH. The problem for many, if not most, enterprises is that the data isn’t accessible in a marketer-friendly way. Typically, an enterprise DWH is constructed to support analytics use cases, not activation use cases. This affects how the data is labeled, managed, related and governed internally.
Recall that a DWH is essentially for storage and computing, which means data is stored in database tables with column names as attributes. You then write complex SQL statements to access that data. It’s unrealistic to expect your marketers to memorize table and column names before they can create segments for activation. In other words, DWHs typically don’t support marketer self-service the way most CDPs do.
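To make the self-service gap concrete, here is a minimal sketch of what building one segment by hand can look like. It uses an in-memory SQLite database as a stand-in for a warehouse, and every table name, column name and threshold is a hypothetical chosen for illustration, not taken from any real schema.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse.
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (cust_id INTEGER PRIMARY KEY, email_addr TEXT, ctry_cd TEXT);
CREATE TABLE fct_order (ord_id INTEGER PRIMARY KEY, cust_id INTEGER, ord_amt REAL, ord_dt TEXT);
INSERT INTO dim_customer VALUES
  (1, 'ana@example.com', 'US'),
  (2, 'ben@example.com', 'US'),
  (3, 'cy@example.com', 'DE');
INSERT INTO fct_order VALUES
  (10, 1, 120.0, '2024-05-02'),
  (11, 1, 40.0, '2024-05-20'),
  (12, 2, 30.0, '2024-05-11'),
  (13, 3, 500.0, '2024-05-15');
""")

# The "segment" is a join plus an aggregate; the author has to know
# that country lives in ctry_cd and spend lives in fct_order.ord_amt.
SEGMENT_SQL = """
SELECT c.email_addr, SUM(o.ord_amt) AS total_spend
FROM dim_customer AS c
JOIN fct_order AS o ON o.cust_id = c.cust_id
WHERE c.ctry_cd = ? AND o.ord_dt >= ?
GROUP BY c.cust_id, c.email_addr
HAVING SUM(o.ord_amt) >= ?
ORDER BY c.email_addr
"""

def high_value_segment(conn, country, since, min_spend):
    """Return (email, total_spend) rows for customers who meet the spend threshold."""
    return conn.execute(SEGMENT_SQL, (country, since, min_spend)).fetchall()

segment = high_value_segment(conn, "US", "2024-05-01", 100.0)
print(segment)
```

Even this small "US customers who spent at least 100 since May 1" audience depends on knowing two table names, four column names and how they join, which is the knowledge a CDP's segmentation interface usually hides.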
This also touches on a broader structural issue. DWHs aren’t typically designed to support the real-time marketing use cases that many CDPs target. A DWH can perform quick calculations, and you can schedule ingestion and processing to run at frequent intervals, but that is still not real-time. Similarly, with some exceptions, a DWH isn’t designed to act on raw data, whereas marketers often want to employ raw data (typically events) to trigger certain activations.
Finally, remember that data and the ability to access it don’t make a CDP. Most CDPs offer some subset of additional capabilities you won’t find in a DWH, such as:
- Event subsystem with triggering.
- Anonymous identity resolution.
- Marketer-friendly interface for segmentation.
- Segment activation profiles with connectors.
- Potentially testing, personalization and recommendation services.
A DWH alone will not provide these capabilities, so you will need to source these elsewhere. Of course, DWH vendors have sizable partner marketplaces. You can find many alternatives, but they’re not native and will require integration and support effort.
Not surprisingly, then, there’s a lot of chatter about “composable CDPs” and the potential role of a DWH in that context. I’ve argued previously that composability is a spectrum, and you start losing benefits beyond a certain point.
Those caveats aside, a DWH can still play a role in a customer data stack. The options include:
- Doing away with a CDP by activating directly from the DWH.
- Using the DWH as a quasi-CDP with a reverse ETL platform.
- Coexisting with a CDP.
Let’s look at these three design patterns.
1. Connecting marketing platforms directly to your DWH
This is perhaps the most extreme version of the approach I critiqued above, but some enterprises have made it work, especially in the pre-CDP era. Platforms such as Snowflake, with its broad ecosystem, are trying to make it viable.
The idea here is that your engagement platform connects directly to the DWH to push and pull data. Many mature email and marketing automation platforms are natively wired to do this, albeit typically via batch push. Your marketers then use the messaging platform to create segments and, in the case of outbound marketing, send messages to those segments.
Now imagine you had another marketing or engagement platform, such as a personalized website or ecommerce platform. Again, you draw data from the DWH, then employ the web application platform to create another set of segments for more targeted engagement.
Do you see the problem yet? There are two sets of segmentation interfaces already. What happens if you have 10 marketing platforms? 20? You will keep creating segments everywhere, so your omnichannel promise disappears.
Finally, what if you had to add another marketing platform that did not support direct ingestion from a DWH?
2. Employ DWH with reverse-ETL tools
This approach solves several problems with the first pattern above. Notably, it allows (in theory) a non-DWH specialist to create universal segments virtually atop the DWH and activate them across multiple platforms. With transformation and a better connector framework, you can apply different label mappings and marketer-friendly data structures to different endpoints.
Here’s how it works. Reverse ETL platforms pull data from the DWH and send it to marketing platforms after any transformation. You can perform multiple transformations and send that data to several destinations simultaneously. You can even automate it and have exports run regularly at a predefined schedule.
But that data (or a subset of it) is copied over to the target platforms, so you don’t really have a single copy of the data. And since the reverse ETL platform itself does not hold a copy, your segments or audiences are always generated at query time (typically in batches) and then exported to destinations.
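The pull, transform and fan-out flow described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not how any particular reverse ETL product works: SQLite stands in for the warehouse, plain Python lists stand in for destination API calls, and the table, column and destination names are all hypothetical.

```python
import sqlite3
from typing import Dict, List

Row = Dict[str, object]

def build_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for the DWH (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript("""
    CREATE TABLE dim_customer (cust_id INTEGER, email_addr TEXT, ltv_amt REAL);
    INSERT INTO dim_customer VALUES
      (1, 'ana@example.com', 900.0),
      (2, 'ben@example.com', 50.0),
      (3, 'cy@example.com', 400.0);
    """)
    return conn

def extract_segment(conn: sqlite3.Connection, min_ltv: float) -> List[Row]:
    """Generate the audience at query time; nothing is stored in this layer."""
    cursor = conn.execute(
        "SELECT cust_id, email_addr, ltv_amt FROM dim_customer "
        "WHERE ltv_amt >= ? ORDER BY cust_id",
        (min_ltv,),
    )
    return [dict(row) for row in cursor]

def remap(rows: List[Row], mapping: Dict[str, str]) -> List[Row]:
    """Rename warehouse columns to the labels a destination expects."""
    return [{dest: row[src] for src, dest in mapping.items()} for row in rows]

# Hypothetical per-destination label mappings (warehouse column -> destination field).
DESTINATIONS: Dict[str, Dict[str, str]] = {
    "email_tool": {"email_addr": "email", "ltv_amt": "lifetime_value"},
    "ad_platform": {"cust_id": "external_id", "email_addr": "email"},
}

def run_sync(conn: sqlite3.Connection, min_ltv: float) -> Dict[str, List[Row]]:
    """One batch run: a single segment query, then one payload per destination."""
    rows = extract_segment(conn, min_ltv)
    # Each payload stands in for an API call; each destination ends up with its own copy.
    return {name: remap(rows, mapping) for name, mapping in DESTINATIONS.items()}

outboxes = run_sync(build_warehouse(), min_ltv=300.0)
for name, payload in outboxes.items():
    print(name, payload)
```

The segment is defined once and reshaped per endpoint, which is the benefit over the first pattern. The cost is visible too: every destination receives its own copy of the rows, and the audience only changes when `run_sync` runs again.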
This is not a suitable approach if you want to have real-time triggers or always-on campaigns based on events. Sure, you can automate your exports at high frequency, but that’s not real-time. And as you increase your export frequency, your costs will climb steeply.
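A rough back-of-the-envelope model shows the trade-off between export frequency, delay and cost. It assumes, purely for illustration, that every export run takes a fixed time and carries a flat cost; real warehouse and reverse ETL pricing varies by vendor and workload, so the figures below are hypothetical.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60

@dataclass(frozen=True)
class ExportPlan:
    interval_minutes: int   # time between scheduled export runs
    run_minutes: float      # hypothetical duration of one export run
    cost_per_run: float     # hypothetical flat cost of one run

    @property
    def runs_per_day(self) -> int:
        return MINUTES_PER_DAY // self.interval_minutes

    @property
    def daily_cost(self) -> float:
        return self.runs_per_day * self.cost_per_run

    @property
    def worst_case_delay_minutes(self) -> float:
        # An event that just misses a run waits a full interval for the
        # next run to start, then for that run to finish.
        return self.interval_minutes + self.run_minutes

for interval in (1440, 60, 5):
    plan = ExportPlan(interval_minutes=interval, run_minutes=2.0, cost_per_run=0.5)
    print(interval, plan.runs_per_day, plan.daily_cost, plan.worst_case_delay_minutes)
```

Under these assumptions, moving from a daily export to one every five minutes multiplies the number of runs (and the daily cost) by 288, yet an event can still wait several minutes before any destination sees it.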
Also, while reverse-ETL tools provide a segmentation interface, they tend to be more technical and DataOps-focused rather than MOps-focused. Before declaring this a “business-friendly” solution suitable for marketer self-service, you must test it carefully.
3. DWH co-exists with CDP
Your enterprise DWH serves as a customer data infrastructure layer that supplies data to your CDP (among other endpoints). Many, if not most, CDPs now offer some capabilities to sync from DWH platforms, notably Snowflake.
There are variations in how these CDPs can co-exist with a DWH. Most CDPs sync and duplicate data into their own repository, whereas others (including reverse-ETL vendors) don’t make a copy. Either way, there are trade-offs to weigh before settling on what works for you.
In general, we tend to see larger enterprises preferring this design pattern, albeit with wide variance around where such critical services as customer identity resolution ultimately reside.
Wrap-up
DWH platforms play increasingly essential roles in martech stacks. However, you still have multiple architectural choices about which services you deliver within your data ecosystem.
I think it’s premature to rule out CDPs in your future. Each pattern has trade-offs to keep in mind as you evaluate your options.
Contributing authors are invited to create content for MarTech and are chosen for their expertise and contribution to the martech community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.